Search Results: Records 1-5 displayed on this page of 5

Oral presentation

Performance evaluation of a modified communication-avoiding generalized minimal residual method on many core platforms

Idomura, Yasuhiro; Ina, Takuya*; Mayumi, Akie; Yamada, Susumu; Matsumoto, Kazuya*; Asahi, Yuichi*; Imamura, Toshiyuki*

no journal

We propose a modified communication-avoiding generalized minimal residual (CA-GMRES) method, which reduces both computation and memory access by 30% while keeping the same CA property as the original CA-GMRES method. These numerical properties, less communication and computation with higher arithmetic intensity, are promising features for future exascale machines with limited memory and network bandwidths. The modified CA-GMRES method is applied to a large-scale non-symmetric matrix in an implicit solver of the gyrokinetic toroidal five-dimensional Eulerian code GT5D, and its performance is estimated on Oakforest-PACS (KNL). The numerical experiment shows that, compared with the generalized conjugate residual method, computing kernels are accelerated by 1.5x, and the cost of data reduction communication is reduced from 12.5% to 1% of the total cost at 1,280 nodes.

Oral presentation

Performance property of preconditioned Chebyshev basis CG solver for multiphase CFD simulations

Mayumi, Akie; Idomura, Yasuhiro; Ina, Takuya*; Yamada, Susumu; Imamura, Toshiyuki*

no journal

The convergence property of the communication-avoiding conjugate gradient (CA-CG) method must be improved before it can be applied to ill-conditioned problems such as the pressure Poisson equation in the multiphase CFD code JUPITER. In the CA-CG method, one can avoid more communication by increasing the number of CA steps. However, this makes the CA-CG method less robust against numerical errors. To resolve this problem, we apply the Chebyshev basis CG (CBCG) method to JUPITER.
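
The abstract does not spell out how the Chebyshev basis is built, so the following is only a minimal sketch of the general idea, not the JUPITER implementation. It assumes a symmetric positive definite matrix A and assumed bounds lam_min and lam_max on its eigenvalues (both hypothetical inputs here), and generates the basis vectors with the three-term Chebyshev recurrence. The helper monomial_basis is included only to compare conditioning against the plain power basis r, A r, A^2 r, ...

```python
import numpy as np

def chebyshev_basis(A, r, s, lam_min, lam_max):
    """Return an n x s matrix whose column j is T_j(B) r.

    B is A shifted and scaled so that the assumed spectrum interval
    [lam_min, lam_max] is mapped onto [-1, 1].
    """
    center = 0.5 * (lam_max + lam_min)
    half_width = 0.5 * (lam_max - lam_min)

    def apply_B(v):
        return (A @ v - center * v) / half_width

    S = np.empty((r.shape[0], s))
    S[:, 0] = r                      # T_0(B) r = r
    if s > 1:
        S[:, 1] = apply_B(r)         # T_1(B) r = B r
    for j in range(2, s):
        # Three-term recurrence: T_j = 2 B T_{j-1} - T_{j-2}
        S[:, j] = 2.0 * apply_B(S[:, j - 1]) - S[:, j - 2]
    return S

def monomial_basis(A, r, s):
    """Return [r, A r, ..., A^(s-1) r], the plain power basis, for comparison."""
    S = np.empty((r.shape[0], s))
    S[:, 0] = r
    for j in range(1, s):
        S[:, j] = A @ S[:, j - 1]
    return S

if __name__ == "__main__":
    # Toy diagonal test matrix with a known spectrum (an assumption of this demo).
    lam = np.linspace(1.0, 100.0, 50)
    A = np.diag(lam)
    r = np.ones(50)
    for s in (4, 8):
        cheb = np.linalg.cond(chebyshev_basis(A, r, s, lam[0], lam[-1]))
        mono = np.linalg.cond(monomial_basis(A, r, s))
        print(f"s={s}: cond(Chebyshev)={cheb:.3e}  cond(monomial)={mono:.3e}")
```

Running the demo prints the condition numbers of both bases for two basis sizes. In this toy setting the monomial basis becomes badly conditioned as the number of steps grows, which is one way to read the loss of robustness mentioned in the abstract.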

Oral presentation

Acceleration of turbulent wind simulation using locally mesh-refined Lattice Boltzmann Method

Onodera, Naoyuki; Idomura, Yasuhiro

no journal

Real-time simulation of the environmental dynamics of radioactive substances is very important from the viewpoint of nuclear security. We develop a CFD code based on a Lattice Boltzmann Method (LBM) with a block-based Adaptive Mesh Refinement (AMR) method. The code is tuned to achieve high performance on the latest Pascal GPU architecture. By introducing a temporal blocking technique, the number of MPI communications is significantly reduced.
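
The idea behind temporal blocking can be illustrated with a small sketch. It is not the LBM code from the presentation: it assumes a 1D periodic three-point averaging stencil, emulates the subdomains inside a single process, and uses no MPI. Each subdomain fetches a halo of width k once and then advances k time steps locally, so one halo exchange replaces k exchanges of width 1. All function names are hypothetical.

```python
import numpy as np

def step(u):
    """One 3-point stencil update; the result is 2 cells shorter than u."""
    return 0.25 * u[:-2] + 0.5 * u[1:-1] + 0.25 * u[2:]

def global_steps(u, k):
    """Reference: k time steps on the whole periodic domain."""
    for _ in range(k):
        u = step(np.concatenate((u[-1:], u, u[:1])))
    return u

def blocked_steps(u, k, parts=2):
    """k time steps per subdomain after a single halo exchange of width k.

    Assumes u.size is divisible by parts.
    """
    n = u.size
    size = n // parts
    pieces = []
    for p in range(parts):
        lo, hi = p * size, (p + 1) * size
        # The only "communication": copy a halo of width k from the neighbours.
        local = u[np.arange(lo - k, hi + k) % n]
        for _ in range(k):
            # The valid region shrinks by one cell per side at every step.
            local = step(local)
        pieces.append(local)
    return np.concatenate(pieces)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    u0 = rng.random(16)
    k = 3
    same = np.allclose(global_steps(u0, k), blocked_steps(u0, k))
    print("blocked result matches reference:", same)
```

The trade-off in this sketch is that each subdomain redundantly recomputes part of the halo region in exchange for fewer communication rounds.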

Oral presentation

Large Eddy Simulation of thermal atmospheric environment in urban boundary layer

Inagaki, Atsushi*; Onodera, Naoyuki; Kanda, Manabu*; Aoki, Takayuki*

no journal

Since the urban atmospheric environment is strongly controlled by multi-scale flow dynamics, simulation of the urban atmospheric boundary layer requires fine grid spacing and a huge computational domain. We succeeded in simulating an urban atmospheric boundary layer using a Large Eddy Simulation model running on the TSUBAME supercomputing system. Flow characteristics within and above a building canopy were successfully examined.

Oral presentation

An AMR framework for realizing effective high-resolution simulations on multiple GPUs

Shimokawabe, Takashi*; Aoki, Takayuki*; Onodera, Naoyuki

no journal

Recently, grid-based physical simulations on multiple GPUs have come to require effective methods for adapting grid resolution to certain sensitive regions of the simulation. In GPU computation, the adaptive mesh refinement (AMR) method is one of the effective methods for computing, at higher resolution, local regions that demand higher accuracy. The AMR method on GPU supercomputers is, however, complicated, and various optimizations suited to GPU supercomputers must be applied to obtain high performance. To develop applications using the AMR method on GPU supercomputers effectively, we are developing a block-based AMR framework for grid-based applications written in C++ and CUDA. Programmers just write the stencil functions that update a grid point on a Cartesian grid.
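
The division of labour described in the last sentence can be sketched as follows. The actual framework is written in C++ and CUDA and its API is not given in the abstract, so this is only a Python illustration with hypothetical names (diffusion_stencil, apply_stencil), on a single uniform grid with no AMR and no GPU. The user supplies a function that updates one grid point, and the framework side decides where and how it is applied.

```python
import numpy as np

def diffusion_stencil(u, i, j, coeff=0.1):
    """User-written part: new value of one grid point from its 4 neighbours."""
    laplacian = u[i - 1, j] + u[i + 1, j] + u[i, j - 1] + u[i, j + 1] - 4.0 * u[i, j]
    return u[i, j] + coeff * laplacian

def apply_stencil(u, stencil):
    """Hypothetical framework part: apply the stencil to every interior point.

    Boundary points are left unchanged in this sketch.
    """
    new = u.copy()
    for i in range(1, u.shape[0] - 1):
        for j in range(1, u.shape[1] - 1):
            new[i, j] = stencil(u, i, j)
    return new

if __name__ == "__main__":
    grid = np.zeros((5, 5))
    grid[2, 2] = 1.0                 # single hot spot in the centre
    print(apply_stencil(grid, diffusion_stencil))
```

Because the stencil function only sees one point and its neighbours, the framework side is free to change the traversal, for example per block or per refinement level, without touching the user code.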
